Crate object_store
object_store
This crate provides a uniform API for interacting with object storage services and
local files via the ObjectStore
trait.
Create an ObjectStore
implementation:
- Google Cloud Storage:
GoogleCloudStorageBuilder
- Amazon S3:
AmazonS3Builder
- Azure Blob Storage:
MicrosoftAzureBuilder
- In Memory:
InMemory
- Local filesystem:
LocalFileSystem
Adapters
ObjectStore
instances can be composed with various adapters
which add additional functionality:
- Rate Throttling:
ThrottleConfig
- Concurrent Request Limit:
LimitStore
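The way an adapter wraps a store can be sketched with the standard library alone. The snippet below is a minimal illustration, not this crate's API: `SimpleStore`, `MemStore`, `CountingStore`, and `demo_adapter` are hypothetical names, and the adapter counts requests instead of throttling or limiting them. The point is only that the wrapper implements the same trait as the store it wraps, so callers can use either interchangeably.

```rust
use std::cell::Cell;
use std::collections::BTreeMap;

/// Hypothetical, simplified storage trait (a stand-in, not the crate's `ObjectStore`).
trait SimpleStore {
    fn put(&mut self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// Hypothetical in-memory backend.
#[derive(Default)]
struct MemStore {
    objects: BTreeMap<String, Vec<u8>>,
}

impl SimpleStore for MemStore {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.objects.insert(key.to_string(), value);
    }

    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.objects.get(key).cloned()
    }
}

/// Hypothetical adapter: forwards every call to `inner` and counts the requests.
struct CountingStore<S: SimpleStore> {
    inner: S,
    requests: Cell<usize>,
}

impl<S: SimpleStore> CountingStore<S> {
    fn new(inner: S) -> Self {
        Self { inner, requests: Cell::new(0) }
    }

    fn request_count(&self) -> usize {
        self.requests.get()
    }
}

// The adapter implements the same trait as the store it wraps.
impl<S: SimpleStore> SimpleStore for CountingStore<S> {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.requests.set(self.requests.get() + 1);
        self.inner.put(key, value);
    }

    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.requests.set(self.requests.get() + 1);
        self.inner.get(key)
    }
}

/// Writes one object and reads it back through the adapter.
fn demo_adapter() -> (Option<Vec<u8>>, usize) {
    let mut store = CountingStore::new(MemStore::default());
    store.put("data/file01.parquet", vec![1, 2, 3]);
    let bytes = store.get("data/file01.parquet");
    (bytes, store.request_count())
}
```

Because the wrapper is itself a `SimpleStore`, adapters of this shape can be stacked on top of one another.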
Listing objects
Use the ObjectStore::list
method to iterate over objects in
remote storage or files in the local filesystem:
use std::sync::Arc;
use object_store::{path::Path, ObjectStore};
use futures::stream::StreamExt;
// create an ObjectStore
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
// Recursively list all files below the 'data' path.
// 1. On AWS S3 this would be the 'data/' prefix
// 2. On a local filesystem, this would be the 'data' directory
let prefix: Path = "data".try_into().unwrap();
// Get an `async` stream of Metadata objects:
let list_stream = object_store
    .list(Some(&prefix))
    .await
    .expect("Error listing files");
// Print a line about each object based on its metadata
// using for_each from `StreamExt` trait.
list_stream
    .for_each(move |meta| {
        async {
            let meta = meta.expect("Error listing");
            println!("Name: {}, size: {}", meta.location, meta.size);
        }
    })
    .await;
Which will print out something like the following:
Name: data/file01.parquet, size: 112832
Name: data/file02.parquet, size: 143119
Name: data/child/file03.parquet, size: 100
...
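The comments in the listing example suggest that the prefix names a whole path segment (the 'data/' prefix on S3, the 'data' directory locally). The following standard-library sketch illustrates that assumption. `prefix_matches` and `list_with_prefix` are hypothetical helpers written for this illustration and are not functions of this crate; the actual matching rules of a given store may differ.

```rust
/// Returns true if every segment of `prefix` equals the corresponding
/// leading segment of `location` (hypothetical helper, assumes '/' delimiters).
fn prefix_matches(location: &str, prefix: &str) -> bool {
    let mut location_parts = location.split('/').filter(|s| !s.is_empty());
    prefix
        .split('/')
        .filter(|s| !s.is_empty())
        .all(|part| location_parts.next() == Some(part))
}

/// Keeps only the locations that sit under `prefix`, compared segment by segment.
fn list_with_prefix<'a>(locations: &[&'a str], prefix: &str) -> Vec<&'a str> {
    locations
        .iter()
        .copied()
        .filter(|location| prefix_matches(location, prefix))
        .collect()
}
```

Under this assumption, a location such as `database/other.parquet` is not returned for the prefix `data`, even though the two strings share their first four characters.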
Fetching objects
Use the ObjectStore::get
method to fetch the data bytes
from remote storage or files in the local filesystem as a stream.
use std::sync::Arc;
use object_store::{path::Path, ObjectStore};
use futures::stream::StreamExt;
// create an ObjectStore
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
// Retrieve a specific file
let path: Path = "data/file01.parquet".try_into().unwrap();
// fetch the bytes from object store
let stream = object_store
    .get(&path)
    .await
    .unwrap()
    .into_stream();
// Count the '0's using `map` from `StreamExt` trait
let num_zeros = stream
    .map(|bytes| {
        let bytes = bytes.unwrap();
        bytes.iter().filter(|b| **b == 0).count()
    })
    .collect::<Vec<usize>>()
    .await
    .into_iter()
    .sum::<usize>();
println!("Num zeros in {} is {}", path, num_zeros);
Which will print out something like the following:
Num zeros in data/file01.parquet is 657
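Since the object arrives as a stream of byte chunks, the total has to be accumulated across chunks. The synchronous sketch below uses only the standard library to show that the per-chunk counts can be summed directly and that the result does not depend on where the chunk boundaries fall. `count_zeros` is a hypothetical helper, and a plain iterator stands in for the async stream.

```rust
/// Counts zero bytes across a sequence of chunks without storing
/// the per-chunk counts (hypothetical helper for illustration).
fn count_zeros<I, B>(chunks: I) -> usize
where
    I: IntoIterator<Item = B>,
    B: AsRef<[u8]>,
{
    chunks
        .into_iter()
        .map(|chunk| chunk.as_ref().iter().filter(|b| **b == 0).count())
        .sum()
}
```

The same bytes split into three chunks or delivered as a single chunk give the same count, which is why the example above can process each chunk independently.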
Modules
An object store implementation for S3
An object store implementation for Azure blob storage
An object store implementation for Google Cloud Storage
An object store that limits the maximum concurrency of the wrapped implementation
An object store implementation for a local filesystem
An in-memory object store implementation
Path abstraction for Object Storage
A throttling object store wrapper
Structs
Exponential backoff with jitter
Result of a list call that includes objects, prefixes (directories) and a token for the next set of results. Individual result sets may be limited to 1,000 objects based on the underlying object storage’s limitations.
The metadata that describes an object.
Contains the configuration for how to respond to server errors
Enums
Traits
Universal API to multiple object store services.
Type Definitions
An alias for a dynamically dispatched object store implementation.
Id type for multi-part uploads.
A specialized Result for object store-related errors